An Experimental Analysis of Consensus Tree Algorithms for Large-Scale Tree Collections

نویسندگان

  • Seung-Jin Sul
  • Tiffani L. Williams
چکیده

Consensus trees are a popular approach for summarizing the shared evolutionary relationships in a collection of trees. Many popular techniques such as Bayesian analyses produce results that can contain tens of thousands of trees to summarize. We develop a fast consensus algorithm called HashCS to construct large-scale consensus trees. We perform an extensive empirical study for comparing the performance of several consensus tree algorithms implemented in widely-used, phylogenetic software such as PAUP* and MrBayes. Our collections of biological and artificial trees range from 128 to 16,384 trees on 128 to 1,024 taxa. Experimental results show that our HashCS approach is up to 100 times faster than MrBayes and up to 9 times faster than PAUP*. Fast consensus algorithms such as HashCS can be used in a variety of ways, such as in real-time to detect whether a phylogenetic search has converged.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Fast Hashing Algorithms to Summarize Large Collections of Evolutionary Trees

Different phylogenetic methods often yield different inferred trees for the same set of organisms. Moreover, a single phylogenetic approach (such as a Bayesian analysis) can produce many trees. Consensus trees and topological distance matrices are often used to summarize the evolutionary relationships among the trees of interest. These summarization techniques are implemented in current phyloge...

متن کامل

Comparison of Machine Learning Algorithms for Broad Leaf Species Classification Using UAV-RGB Images

Abstract: Knowing the tree species combination of forests provides valuable information for studying the forest’s economic value, fire risk assessment, biodiversity monitoring, and wildlife habitat improvement. Fieldwork is often time-consuming and labor-required, free satellite data are available in coarse resolution and the use of manned aircraft is relatively costly. Recently, unmanned aeria...

متن کامل

A Compressed Format for Collections of Phylogenetic Trees and Improved Consensus Performance

Phylogenetic tree searching algorithms often produce thousands of trees which biologists save in Newick format in order to perform further analysis. Unfortunately, Newick is neither space efficient, nor conducive to post-tree analysis such as consensus. We propose a new format for storing phylogenetic trees that significantly reduces storage requirements while continuing to allow the trees to b...

متن کامل

A Metaheuristic Algorithm for the Minimum Routing Cost Spanning Tree Problem

The routing cost of a spanning tree in a weighted and connected graph is defined as the total length of paths between all pairs of vertices. The objective of the minimum routing cost spanning tree problem is to find a spanning tree such that its routing cost is minimum. This is an NP-Hard problem that we present a GRASP with path-relinking metaheuristic algorithm for it. GRASP is a multi-start ...

متن کامل

A Fuzzy Decision-Making Methodology for Risk Response Planning in Large-Scale Projects

Risk response planning is one of the main phases in the project risk management and has major impacts on the success of a large-scale project. Since projects are unique, and risks are dynamic through the life of the projects, it is necessary to formulate responses of the important risks. The conventional approaches tend to be less effective in dealing with the impreciseness of risk response p...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2009